Adaptation of the Translation Model for Statistical Machine Translation based on Information Retrieval

Authors

Almut Silja Hildebrand

Matthias Eck

Stephan Vogel

Alex Waibel

Abstract

In this paper we present experiments concerning translation model adaptation for statistical machine translation. We develop a method to adapt translation models using information retrieval. The approach selects sentences similar to the test set to form an adapted training corpus. The method allows a better use of additionally available out-of-domain training data or finds in-domain data in a mixed corpus. The adapted translation models significantly improve the translation performance compared to competitive baseline systems.
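The selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which retrieval model, similarity measure, or selection size was used, so the TF-IDF weighting, the cosine similarity, the `top_n` parameter, and the function names below are all assumptions made for illustration, not the authors' implementation. For each test sentence, the sketch ranks the source side of the training sentence pairs by similarity and keeps the best matches; the union of these matches forms the adapted training corpus.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_adapted_corpus(train_pairs, test_sentences, top_n=2):
    """Pick the training pairs whose source side is most similar to the test set.

    train_pairs: list of (source, target) sentence strings.
    test_sentences: list of source-language test sentences.
    top_n: hypothetical parameter, number of matches kept per test sentence.
    Returns the selected pairs in their original corpus order.
    """
    # Whitespace tokenization is a simplification chosen for this sketch.
    train_tokens = [src.lower().split() for src, _ in train_pairs]
    n_docs = len(train_tokens)

    # Smoothed inverse document frequency over the training source sentences.
    doc_freq = Counter(t for toks in train_tokens for t in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}

    def vectorize(tokens):
        # Terms unseen in the training corpus cannot match anything; skip them.
        return {t: tf * idf[t] for t, tf in Counter(tokens).items() if t in idf}

    train_vecs = [vectorize(toks) for toks in train_tokens]

    selected = set()
    for sentence in test_sentences:
        query = vectorize(sentence.lower().split())
        scored = [(cosine(query, vec), i) for i, vec in enumerate(train_vecs)]
        # Highest similarity first; ties are broken by corpus position.
        scored.sort(key=lambda item: (-item[0], item[1]))
        selected.update(i for score, i in scored[:top_n] if score > 0.0)

    # The union over all test sentences forms the adapted training corpus.
    return [train_pairs[i] for i in sorted(selected)]


if __name__ == "__main__":
    # Toy mixed-domain corpus, invented purely for demonstration.
    corpus = [
        ("where is the train station", "wo ist der bahnhof"),
        ("the parliament adopted the resolution", "das parlament nahm die entschliessung an"),
        ("i would like a ticket to the airport", "ich moechte eine fahrkarte zum flughafen"),
    ]
    for pair in select_adapted_corpus(corpus, ["how do i get to the train station"], top_n=1):
        print(pair)
```

In a setup like this, the selected pairs would then be used to train the translation model in place of, or in addition to, the full corpus. How large the selection should be is a tuning question that this sketch leaves open.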


Similar resources

A new model for persian multi-part words edition based on statistical machine translation

Multi-part words in English language are hyphenated and hyphen is used to separate different parts. Persian language consists of multi-part words as well. Based on Persian morphology, half-space character is needed to separate parts of multi-part words where in many cases people incorrectly use space character instead of half-space character. This common incorrectly use of space leads to some s...


Language Model Adaptation for Statistical Machine Translation Based on Information Retrieval

Language modeling is an important part for both speech recognition and machine translation systems. Adaptation has been successfully applied to language models for speech recognition. In this paper we present experiments concerning language model adaptation for statistical machine translation. We develop a method to adapt language models using information retrieval methods. The adapted language...


NTT statistical machine translation system for IWSLT 2010

In this year’s IWSLT evaluation campaign (TALK task), we applied three adaptation techniques: (1) training data selection based on information retrieval approach, (2) subsentence segmentation, and (3) language model adaptation using source-side of the test set. We also applied a sequential labeling method based on conditional random fields for restoring punctuation markers in the ASR input cond...


Domain Adaptation of Statistical Machine Translation Models with Monolingual Data for Cross Lingual Information Retrieval

Statistical Machine Translation (SMT) is often used as a black-box in CLIR tasks. We propose an adaptation method for an SMT model relying on the monolingual statistics that can be extracted from the document collection (both source and target if available). We evaluate our approach on CLEF Domain Specific task (German-English and English-German) and show that very simple document collection st...


A Hybrid Machine Translation System Based on a Monotone Decoder

In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...



Publication date: 2005
